Context-based Quasi-Synonym Extraction
Abstract
Near- or quasi-synonyms can play an important role in Information Retrieval, since they may help to address the problem of vocabulary mismatch between queries and documents. One current approach to generating quasi-synonyms uses a general similarity measure to score the synonymy of two words based on their context vectors. In contrast, we compare two simple measures that more directly take into account the contextual evidence supporting a synonymy relation. Experimental results using the Google n-gram collection show that our methods produce better synonyms than existing approaches.
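The baseline that the abstract contrasts against, scoring two words by a general similarity measure over their context vectors, can be sketched roughly as follows. This is only an illustration under assumptions: cosine similarity stands in for the unspecified "general similarity measure", the (offset, neighbor) feature scheme is one plausible way to build context vectors from n-gram counts, and the function names and toy counts are hypothetical. It does not reproduce the two measures the authors propose, which the abstract does not describe.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(ngrams):
    """Build a context vector per word from (tokens, count) n-gram records.

    Each feature is a hypothetical (relative offset, neighbor word) pair,
    weighted by the n-gram's count.
    """
    vectors = defaultdict(Counter)
    for tokens, count in ngrams:
        for i, word in enumerate(tokens):
            for j, neighbor in enumerate(tokens):
                if i != j:
                    vectors[word][(j - i, neighbor)] += count
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * v[feature] for feature, weight in u.items() if feature in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy n-gram counts, invented for illustration only.
TOY_NGRAMS = [
    (("drive", "a", "car"), 10),
    (("drive", "a", "automobile"), 4),
    (("eat", "a", "banana"), 7),
    (("fast", "car"), 3),
]

vectors = context_vectors(TOY_NGRAMS)
score_synonym = cosine(vectors["car"], vectors["automobile"])
score_unrelated = cosine(vectors["car"], vectors["banana"])
print(score_synonym > score_unrelated)
```

In this toy data, "car" and "automobile" share both of their left contexts, while "car" and "banana" share only the article "a", so the first pair scores higher. A measure of this kind rewards any overlap in context, which is the generality the abstract contrasts with measures that weigh the contextual evidence for synonymy more directly.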
Similar resources
Practice in Synonym Extraction at Large Scale
Synonym extraction is an important task in natural language processing and often used as a submodule in query expansion, question answering and other applications. Automatic synonym extractor is highly preferred for large scale applications. Previous studies in synonym extraction are most limited to small scale datasets. In this paper, we build a large dataset with 3.4 million synonym/nonsynony...
Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature
We present a method for the extraction of synonyms for German particle verbs based on a word-aligned German-English parallel corpus: by translating the particle verb to a pivot, which is then translated back, a set of synonym candidates can be extracted and ranked according to the respective translation probabilities. In order to deal with separated particle verbs, we apply re-ordering rules to...
Biomedical Semantics in the Big Data Era
1 Doing-Harris K, Livnat Y, Meystre S Automated concept and relationship extraction for the Semi-Automated Ontology Management (SEAM) System Journal of Biomedical Semantics 2015, 6:15 doi:10.1186/s13326-015-0011-7 Ontology; Natural language processing; Terminology extraction Background: We develop medical-specialty specific ontologies that contain the settled science and common term usage. We ...
Self-Supervised Synonym Extraction from the Web
Current synonym extraction methods work in a “closed” way. Given the problem word and set of target words, researchers have to choose words synonymous with the problem word using features such as lexical patterns and distributional similarities. This paper tries to discover synonyms in an “open” way and presents a synonym extraction framework based on self-supervised learning. We first analysis...
Finding High-Frequent Synonyms of A Domain-Specific Verb in English Sub-Language of MEDLINE Abstracts Using WordNet
The task of binary relation extraction in IE [3] is based mainly on high-frequent verbs and patterns. During the extraction of a specific relation from MEDLINE English abstracts, it is noticed that besides the high-frequent verb itself which represents the specific relation, some other word forms, such as the nominal and adjective forms of this verb, as well as its synonyms, also play a very im...